{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Trying to predict number of COVID-19 tests by August 1st 2020\n",
    "*Notebook by Steven Rybicki*\n",
    "\n",
    "In this notebook, we'll be focusing on trying to predict the number of COVID-19 tests in the US. Namely, we'll be looking at forecasting the answer to:\n",
    "> By August 1, how many tests for COVID-19 will have been administered in the US?\n",
    "\n",
    "taken from the [Metaculus question](https://pandemic.metaculus.com/questions/4400/by-august-1-how-many-tests-for-covid-19-will-have-been-administered-in-the-us/). This just looks at number of tests, not distinguishing by type or if someone is tested multiple times.\n",
    "\n",
    "To do this, we're going to be using [Ergo](https://github.com/oughtinc/ergo), a library by [Ought](https://ought.org/). This lets you integrate model based forecasting (building up a numerical model to predict outcomes) with judgement based forecasting (using calibration to predict outcomes less formally). To see other notebooks using Ergo, see [the Ergo repo](https://github.com/oughtinc/ergo/tree/master/notebooks). \n",
    "\n",
    "I'll be trying to walk through how you'd use this realistically if you wanted to forecast the answer to a question without investing a lot of time on exploration. This means our model will be kind of rough, but hopefully accurate enough to provide value, and I'll be iteratively trying to improve it as we go along. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Notebook setup\n",
    "\n",
    "This imports libraries, and sets up some useful functions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture\n",
    "!pip install --progress-bar off poetry\n",
    "!pip install --progress-bar off git+https://github.com/oughtinc/ergo.git@e9f6463de652f4e15d7c9ab0bb9f067fef8847f7"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "import ssl\n",
    "warnings.filterwarnings(action=\"ignore\", category=FutureWarning)\n",
    "warnings.filterwarnings(action=\"ignore\", module=\"plotnine\")\n",
    "ssl._create_default_https_context = ssl._create_unverified_context"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import ergo\n",
    "import seaborn\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from datetime import timedelta, date, datetime\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "pd.set_option('precision', 2)\n",
    "\n",
    "def summarize_samples(samples):\n",
    "    \"\"\"\n",
    "    Print out the p5, p50 and p95 of the given sample\n",
    "    \"\"\"\n",
    "    stats = samples.describe(percentiles=[0.05, 0.5, 0.95])\n",
    "    percentile = lambda pt: float(stats.loc[f\"{pt}%\"])\n",
    "    return f\"{percentile(50):.2f} ({percentile(5):.2f} to {percentile(95):.2f})\"\n",
    "\n",
    "def show_marginal(func):\n",
    "    \"\"\"\n",
    "    Use Ergo to generate 1000 samples of the distribution of func, and then plot them as a distribution. \n",
    "    \"\"\"\n",
    "    samples = ergo.run(func, num_samples=1000)[\"output\"]\n",
    "    seaborn.distplot(samples).set_title(func.__doc__);\n",
    "    plt.show()\n",
    "    print(f\"Median {func.__doc__}: {summarize_samples(samples)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# How are we going to predict this?\n",
    "\n",
    "## Principles\n",
    "\n",
    "Here's a couple things we're going to try to do when predicting this:\n",
    "\n",
    "- *fermi estimate / decomposition*: focus on decomposing the question into multiple parts that we can estimate separately. We'll start with our overall question (how many tests) and decompose it into smaller questions that are easier to predict (e.g. what's the estimated number of tests we will make tomorrow?).\n",
    "- *use live data*: where possible, fetch data from an up to date source. This lets us easily update our guess, as we can just rerun the notebook and it will update our prediction based on updated data. \n",
    "- *wisdom of the crowds*: incorporate other predictions, namely the [Metaculus one](https://pandemic.metaculus.com/questions/4400/by-august-1-how-many-tests-for-covid-19-will-have-been-administered-in-the-us/), so we can include information we might have missed.\n",
    "\n",
    "## Question decomposition\n",
    "\n",
    "Let's decompose our question into a number of sub questions. We'll then focus on trying to estimate each of these separately, allowing us to iteratively improve our model as we go along.\n",
    "\n",
    "So the number of tests done by August 1st could be decomposed as:\n",
    " - ensemble prediction of:\n",
    "     - my prediction\n",
    "        - current number of tests done to date: how many tests have we done up until today?\n",
    "        - how much we can expect that number to change until August 1st?\n",
    "            - if current rates remain unchanged, what will that number be in August?\n",
    "            - what change in testing rates should we expect based on past changes?\n",
    "            - what will be the impact of future factors on testing rates?\n",
    "     - metaculus question "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# My Prediction\n",
    "\n",
    "## Get relevant data\n",
    "\n",
    "To know the current testing rate, and how it changes, we're going to use the https://covidtracking.com/ API to fetch the testing statistics. To make it easier to parse, we're going to filter to the fields we care about:\n",
    "- date\n",
    "- totalTestResults: culmulative tests per day\n",
    "- totalTestResultsIncrease: how many new test results were reported that day"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "testing_data = pd.read_csv(\"https://covidtracking.com/api/v1/us/daily.csv\")\n",
    "# testing data uses numbers for dates (20200531 for e.g.), so let's convert that to use python dates instead\n",
    "testing_data[\"date\"] = testing_data[\"date\"].apply(lambda d: datetime.strptime(str(d),  \"%Y%m%d\"))\n",
    "\n",
    "# now filter just to the columns we care about\n",
    "testing_data = testing_data[[\"date\", \"totalTestResults\", \"totalTestResultsIncrease\"]]\n",
    "testing_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Current number of tests\n",
    "\n",
    "We can now use our dataset to easily find the number of tests done today (or the latest in the dataset), which we can verify is correct by looking at the table above"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# pick the largest, therefore most up to date, date\n",
    "current_date = testing_data[\"date\"].max()\n",
    "\n",
    "# if there's multiple entries for a day for whatever reason, take the first\n",
    "current_test_number = testing_data.loc[testing_data[\"date\"] == current_date][\"totalTestResults\"][0]\n",
    "\n",
    "current_date, current_test_number"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How much can we expect this to change until August 1st?\n",
    "\n",
    "We now want to predict how many new tests we can expect from now until August 1st. One easy way to do this:\n",
    "- look at the distribution of new tests per day over the last month\n",
    "- if we simulate what happens if we continue like this until August 1st, what distribution do we expect?\n",
    "\n",
    "### Distribution of tests over the past month\n",
    "\n",
    "First, let's look at the `totalTestResultsIncrease` over the past month."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "last_month = current_date - timedelta(days=30)\n",
    "test_data_over_last_month = testing_data[testing_data['date'].between(last_month, current_date)]\n",
    "\n",
    "\n",
    "seaborn.distplot(test_data_over_last_month[\"totalTestResultsIncrease\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, we want to use this to help build a model. Since we're using Ergo, this is easy: we just create a function that represents this field, which returns a random sample from the dataset. Then, to simulate the distribution of this field, we can sample from this function a large number of times. This is what the `show_marginal` function does, as well as plotting the resulting distribution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_results_increase_per_day():\n",
    "    \"# test results per day\"\n",
    "    return ergo.random_choice(list(test_data_over_last_month[\"totalTestResultsIncrease\"]))\n",
    "\n",
    "show_marginal(test_results_increase_per_day)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sanity check, this distribution looks very similar as plotting the distribution above, which makes sense as they should be equivalent since the above is just randomly sampling from the original.\n",
    "\n",
    "Right now, this isn't especially useful: we've just created a similar distribution to one we had originally. But now that we've created this in Ergo, it becomes easy to manipulate and combine this with other distributions. We'll be doing this to extrapolate how many tests we can expect by August."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## How much we can expect that to increase by August 1st?\n",
    "\n",
    "### Repeating our current rates\n",
    "Now, using our new `test_results_increase_per_day` function, can we extrapolate what the number of tests tomorrow could look like? One way to do this: \n",
    "- look at test results today, `current_test_number`\n",
    "- sample from our `test_results_increase_per_day` distribution, add it to how many we have so far\n",
    "\n",
    "Since `test_results_increase_per_day` returns a sample from this distribution, this just looks like adding `current_test_number` to `test_results_increase_per_day`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_results_tomorrow():\n",
    "    \"# test results tomorrow\"\n",
    "    return current_test_number + test_results_increase_per_day()\n",
    "\n",
    "show_marginal(test_results_tomorrow)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Can we extend this to extrapolate further in the future? One simple way to do this (which we'll refine later): call `test_results_increase_per_day()` for each day in the future we want to forecast, and add them together. We could also just do `totalTestResultsIncrease() * number_of_days`, but this approach lets us sample more points and better approximate the potential increase. \n",
    "\n",
    "To verify this is working in a sensible way, let's test it for tomorrow:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_results_for_date(date: datetime):\n",
    "    number_of_days = (date - current_date).days\n",
    "    return current_test_number + sum(test_results_increase_per_day() for _ in range(number_of_days))\n",
    "\n",
    "def test_results_for_date_tomorrow():\n",
    "    \"# test results tomorrow using test_results_for_date\"\n",
    "    return test_results_for_date(current_date + timedelta(days=1))\n",
    "\n",
    "show_marginal(test_results_for_date_tomorrow)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "At least in my notebook, these show the exact same answer. So, now, if we want to predict for August, we can just change the date given."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_results_for_deadline():\n",
    "    \"# test results for August 1st\"\n",
    "    return test_results_for_date(datetime(2020, 8, 1))\n",
    "\n",
    "show_marginal(test_results_for_deadline)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that with Ergo, we can just use normal python operators and functions (`+, sum`) to sample from `test_results_increase_per_day()` and create this new distribution `test_results_for_date`. This lets us focus on building our model, and not have to worry about the math of combining distributions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What change in testing rates should we expect based on past changes?\n",
    "\n",
    "The above analysis assumes that number of tests per day is not increasing, as if it is we'll be underestimating. Let's validate that, and look at how the number of tests has changed over time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# create a days column, since that will help us make the x-axis look cleaner \n",
    "# (dates overlap with a smaller graph)\n",
    "first_date = testing_data[\"date\"].min()\n",
    "testing_data[\"days\"] = \\\n",
    "    testing_data[\"date\"].apply(lambda d: (d - first_date).days)\n",
    "\n",
    "seaborn.lineplot(x=\"days\", y=\"totalTestResultsIncrease\", data=testing_data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So we can see that roughly, month over month, the number of tests per day are increasing linearly. This means that our `test_results_for_deadline()` model above is probably underestimating the number of tests, as it's assuming the testing capacity remains static. \n",
    "\n",
    "What we want is to estimate the day over day increase, and use that to increase the number of tests our model is adding per day. Since it looks very linear right now, let's use a linear regression to estimate the slope of increases. We'll look at data when we past 100 000 cases, as that's where the trend seems to become linear."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# ignore early testing data, as it was mostly zero, then growing exponentially\n",
    "test_data_worth_looking_at = testing_data[testing_data['totalTestResults'] >= 100000]\n",
    "\n",
    "# do a linear regression using scipy\n",
    "from scipy import stats\n",
    "slope_of_test_increases = stats.linregress(test_data_worth_looking_at[\"days\"], test_data_worth_looking_at[\"totalTestResultsIncrease\"]).slope\n",
    "\n",
    "slope_of_test_increases"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, with this in hand, we can modify our previous estimate by using this slope. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_results_for_date_with_slope(date: datetime):\n",
    "    number_of_days = (date - current_date).days\n",
    "    return current_test_number + sum(test_results_increase_per_day() + slope_of_test_increases * day for day in range(number_of_days))\n",
    "\n",
    "\n",
    "def test_results_for_deadline_with_slope():\n",
    "    \"# test results for August 1st with linear increase in tests\"\n",
    "    return test_results_for_date_with_slope(datetime(2020, 8, 1))\n",
    "\n",
    "show_marginal(test_results_for_deadline_with_slope)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What will be the impact of future factors on testing rates?\n",
    "Right now, our model is potentially a bit too overconfident. Testing might not increase linearly at this rate forever:\n",
    "- we might increase testing rate, because of:\n",
    "    - changes in testing policy to increase number of tests\n",
    "    - outbreaks that intensify in regions\n",
    "    - tests that are cheaper \n",
    "    - more infrastructure to deploy tests\n",
    "    - etc.\n",
    "- we might decrease testing rate, because of:\n",
    "    - changes in testing policy to decrease the number of tests, to try understate the impact of the disease\n",
    "    - containment strategies working well\n",
    "    - hitting capacity limits on test manufacture\n",
    "    - not being able to import any tests due to shortages in other countries\n",
    "    - etc.\n",
    "\n",
    "We could go down the rabbit hole of trying to predict each of these, but that would take a ton of time, and not necessarily change the prediction much. The cheaper time-wise thing to do is to try estimate how much we expect testing rates to change in the future. This is using the judgement based forecasting we mentioned above. \n",
    "\n",
    "I'm going to guess that linear growth is still possible until August, but that it could be anything from 0.25 to 2 current growth per day. we're going to approximate this with a lognormal distribution, as we think that the most likely rate is around 1, but don't want to discount values at the tails."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def estimated_slope():\n",
    "    \"# estimated slope\"\n",
    "    return ergo.lognormal_from_interval(0.5, 2) * slope_of_test_increases\n",
    "\n",
    "show_marginal(estimated_slope)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can adapt our previous model, just using this estimated slope instead. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_results_for_date_with_slope(date: datetime):\n",
    "    # The slightly more correct way of doing this would be to replace estimated_slope() * day with repeated calls\n",
    "    # to estimated_slope. But as we're predicing kind of far into the future, this makes the simulation really slow.\n",
    "    number_of_days = (date - current_date).days\n",
    "    return current_test_number + sum(test_results_increase_per_day() + estimated_slope() * day for day in range(number_of_days))\n",
    "\n",
    "\n",
    "def test_results_for_deadline_with_slope():\n",
    "    \"# test results for August 1st with linear increase in tests\"\n",
    "    return test_results_for_date_with_slope(datetime(2020, 8, 1))\n",
    "\n",
    "show_marginal(test_results_for_deadline_with_slope)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Incorporating the Metaculus prediction\n",
    "\n",
    "Now that we have our own prediction, let's look at the Metaculus one. This will give us a sense of how our model stacks up to their community's model, and is a good sanity check to see if we missed something obvious. If you're running this notebook, remember to replace the credentials below with your own."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "metaculus = ergo.Metaculus(username=\"oughtpublic\", password=\"123456\", api_domain=\"pandemic\")\n",
    "testing_question = metaculus.get_question(4400, name=\"# tests administered\")\n",
    "testing_question.show_community_prediction()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is similar to our prediction, just less certain (longer tails) and centered slightly higher. Since we think they have valuable information we might have missed, we're going to create a simple essemble prediction: pick randomly between a prediction the Metaculus community would make, and we would make, to create a combined model.\n",
    "\n",
    "Seaborn at time of writing has difficult plotting a distribution plot for this, because of some large outliers in the sample, so let's use a boxplot instead. This also helps show how to use Ergo outside of our `show_marginal` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def ensemble_prediction():\n",
    "    \"# ensemble prediction\"\n",
    "    if ergo.flip(0.5):\n",
    "        return testing_question.sample_community()\n",
    "    else:\n",
    "        return test_results_for_deadline_with_slope()\n",
    "\n",
    "# use ergo to create a distribution from 1000 samples, using our ensemble prediction\n",
    "# this is output as a pandas dataframe, making it easy to plot and manipulate\n",
    "samples = ergo.run(ensemble_prediction, num_samples=1000)[\"output\"]\n",
    "\n",
    "# show them as a box plot\n",
    "seaborn.boxplot(samples)\n",
    "print(\"Median:\", summarize_samples(samples))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This shows a distribution similar to the one we had, just less certain, and with some probability for the long tails. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Submitting to metaculus\n",
    "\n",
    "If we want to, we can now submit this model to Metaculus."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# below commented out to avoid accidentally submitting an answer when you don't mean to\n",
    "# samples = ergo.run(ensemble_prediction, num_samples=1000)[\"output\"]\n",
    "# testing_question.submit_from_samples(samples)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Summary\n",
    "To create this model, we did the following:\n",
    "- we fetched the latest data from https://covidtracking.com/\n",
    "- estimated the growth of the number of tests to be a combination of the past linear growth, and a log-normal distribution\n",
    "- used that to extrapolate the number of tests we'd have in August\n",
    "- then combined that prediction with the metaculus prediction, to create a final forecast for number of tests on August 1st\n",
    "\n",
    "Since we did a lot of iteration on our model, I'm including a cleaned up version of the entire thing below, so you can see everything at once."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Covid19TestsModel(object):\n",
    "    def __init__(self, testing_data, testing_question):\n",
    "        self.testing_data = testing_data\n",
    "        self.testing_question = testing_question\n",
    "\n",
    "        self.last_month = current_date - timedelta(days=30)\n",
    "        self.current_date = testing_data[\"date\"].max()\n",
    "        # if there's multiple entries for a day for whatever reason, take the first\n",
    "        self.current_test_number = testing_data.loc[testing_data[\"date\"] == current_date][\"totalTestResults\"][0]\n",
    "        \n",
    "        # data subsets\n",
    "        self.test_data_over_last_month = self.testing_data[testing_data['date'].between(last_month, current_date)]\n",
    "\n",
    "        # calculate slope\n",
    "        test_data_worth_looking_at = testing_data[testing_data['totalTestResults'] >= 100000]\n",
    "        self.slope_of_test_increases = stats.linregress(\n",
    "            test_data_worth_looking_at[\"days\"], \n",
    "            test_data_worth_looking_at[\"totalTestResultsIncrease\"]).slope\n",
    "        \n",
    "    def test_results_increase_per_day(self):\n",
    "        \"\"\"\n",
    "        Estimated test increase over the past day looking at increases over the last month\n",
    "        \"\"\"\n",
    "        return ergo.random_choice(list(self.test_data_over_last_month[\"totalTestResultsIncrease\"]))\n",
    "    \n",
    "    def estimated_slope(self):\n",
    "        \"\"\"\n",
    "        Estimated slope of increase of tests per day looking at linear regression of test cases,\n",
    "        and a log-normal prediction of the possible changes\n",
    "        \"\"\"\n",
    "        return ergo.lognormal_from_interval(0.5, 2) * self.slope_of_test_increases\n",
    "    \n",
    "    def test_results_for_date_with_slope(self, date: datetime):\n",
    "        \"\"\"\n",
    "        Estimated test results for date, estimating based on the number of estimated test results per day\n",
    "        including the estimated rate of increase\n",
    "        \"\"\"\n",
    "        number_of_days = (date - self.current_date).days\n",
    "        return self.current_test_number + \\\n",
    "            sum(self.test_results_increase_per_day() + self.estimated_slope() * day for day in range(number_of_days))\n",
    "\n",
    "\n",
    "    def test_results_for_deadline_with_slope(self):\n",
    "        return self.test_results_for_date_with_slope(datetime(2020, 8, 1))\n",
    "    \n",
    "    def ensemble_prediction(self):\n",
    "        \"# ensemble prediction\"\n",
    "        if ergo.flip(0.5):\n",
    "            return self.testing_question.sample_community()\n",
    "        else:\n",
    "            return self.test_results_for_deadline_with_slope()\n",
    "\n",
    "show_marginal(Covid19TestsModel(testing_data, testing_question).ensemble_prediction)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This hopefully shows why I like Ergo: most of the code above is setting up the data. The actual building of the model, both extrapolating from the data and incorporating my estimates, is pretty succinct and (hopefully) simple.\n",
    "\n",
    "If you want to make your own models, see the [Ergo repo](https://github.com/oughtinc/ergo) for instructions on how ot get started, or just look at [some existing notebooks](https://github.com/oughtinc/ergo/tree/master/notebooks). "
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "cell_metadata_filter": "-all",
   "main_language": "python",
   "notebook_metadata_filter": "-all"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
